Filtering Relevant Text Passages Based on Lexical Cohesion
Authors
Abstract
Monitoring news and blogs has become a promising application for globally operating groups that want to recognize topic developments in a fragmented topic landscape. News articles, especially long ones, may cover several topics or different aspects of the same topic. In Topic Detection and Tracking (TDT), it is hard to trace how a topic develops in a stream of news or blog articles with respect to a specific information need, since each article often contains only a limited amount of the relevant information. In this paper, we address the problem of filtering relevant portions of text, commonly known as passage retrieval, by using linear text segmentation methods based on lexical cohesion. We present two strategies for passage retrieval and compare their performance with two cohesion-based approaches developed in the context of linear text segmentation: TextTiling (cf. [Hea97]) and TSF (cf. [KG09]).
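To make the idea of cohesion-based segmentation concrete, the following is a minimal sketch in the spirit of TextTiling, not the method evaluated in the paper. It scores each gap between sentences by the cosine similarity of the word counts on either side and splits where the score falls below a fixed threshold. The fixed threshold, the tiny stopword list, the block size, and all function names are assumptions made for illustration. TextTiling itself places boundaries using depth scores rather than a fixed cutoff, and the abstract does not describe the two proposed strategies in enough detail to reproduce them.

```python
import math
import re
from collections import Counter

# Tiny illustrative stopword list (an assumption, not taken from the paper).
STOPWORDS = {"the", "a", "an", "of", "and", "was", "is", "in", "to"}


def tokenize(sentence):
    """Lowercase a sentence and return its alphabetic tokens minus stopwords."""
    return [t for t in re.findall(r"[a-z]+", sentence.lower()) if t not in STOPWORDS]


def cosine(a, b):
    """Cosine similarity between two term-frequency Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def gap_scores(sentences, block_size=2):
    """Cohesion score at every gap between consecutive sentences.

    The score at index i compares the block of up to `block_size` sentences
    ending at sentence i with the block starting at sentence i + 1.
    """
    bags = [Counter(tokenize(s)) for s in sentences]
    scores = []
    for gap in range(1, len(bags)):
        left = sum(bags[max(0, gap - block_size):gap], Counter())
        right = sum(bags[gap:gap + block_size], Counter())
        scores.append(cosine(left, right))
    return scores


def segment(sentences, threshold=0.1, block_size=2):
    """Split sentences into segments wherever cohesion drops below threshold."""
    if not sentences:
        return []
    segments, current = [], [sentences[0]]
    for sentence, score in zip(sentences[1:], gap_scores(sentences, block_size)):
        if score < threshold:
            segments.append(current)
            current = []
        current.append(sentence)
    segments.append(current)
    return segments


def relevant_segments(segments, query):
    """Keep the segments that share at least one term with the query."""
    terms = set(tokenize(query))
    return [seg for seg in segments if terms & {t for s in seg for t in tokenize(s)}]


sentences = [
    "The bank raised interest rates.",
    "Interest rates affect the bank loans.",
    "The football team won the match.",
    "The match was a football classic.",
]
segments = segment(sentences)
passages = relevant_segments(segments, "football match")
```

Here the only gap with no shared vocabulary lies between the second and third sentences, so the text splits into a banking segment and a football segment, and the query keeps only the latter. Matching segments against query terms is the simplest possible filter and stands in for a real passage retrieval strategy.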
Similar resources
On document relevance and lexical cohesion between query terms
Lexical cohesion is a property of text, achieved through lexical-semantic relations between words in text. Most information retrieval systems make use of lexical relations in text only to a limited extent. In this paper we empirically investigate whether the degree of lexical cohesion between the contexts of query terms’ occurrences in a document is related to its relevance to the query. Lexica...
A Study of Document Relevance and Lexical Cohesion between Query Terms
Lexical cohesion is a property of text, achieved through lexical-semantic relations between words in text. Most information retrieval systems make use of lexical relations in text only to a limited extent. In this paper we empirically investigate whether the degree of lexical cohesion between the contexts of query terms' occurrences in a document is related to its relevance to the query. Experim...
Biased LexRank: Passage retrieval using random walks with question-based priors
We present Biased LexRank, a method for semi-supervised passage retrieval in the context of question answering. We represent a text as a graph of passages linked based on their pairwise lexical similarity. We use traditional passage retrieval techniques to identify passages that are likely to be relevant to a user’s natural language question. We then perform a random walk on the lexical similar...
Trade-Off between Factors Influencing Quality of the Summary
Our summarization approach is based on the assumption that the quality of a summary is influenced by a set of factors that depend on the lexical and grammatical features of the text units selected and arranged while composing the summary. The system was developed taking into account six factors influencing the final quality: compliance with the genre "summary", relevance, focusing, compliance wi...
Lexical Cohesion Based Topic Modeling for Summarization
In this paper, we attack the problem of forming extracts for text summarization. Forming extracts involves selecting the most representative and significant sentences from the text. Our method takes advantage of the lexical cohesion structure in the text in order to evaluate significance of sentences. Lexical chains have been used in summarization research to analyze the lexical cohesion struct...